AITopics

2605.21552

Country: Asia > China (0.28)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

arXiv.org Machine LearningMay-22-2026

MMD-Balls as Credal Sets: A PAC-Bayesian Framework for Epistemic Uncertainty in Test-Time Adaptation

Ariq, Ahanaf Hasan

Reliable deployment of machine learning models requires reasoning under epistemic uncertainty--the ability to recognize when the operating distribution has shifted beyond the scope of what was encountered during training. This challenge is central to test-time adaptation (TTA), a paradigm in which a model pretrained on source distribution Ps receives unlabeled data from a target distribution Pt = Ps at deployment time. Existing TTA methods (Wang et al., 2021; Niu et al., 2023; Zhang et al., 2022a; Yuan et al., 2023; Su et al., 2022) improve accuracy under distribution shift by adapting model parameters using statistics computed from test batches, but they provide no formal guarantees about when predictions should be trusted or how much risk degrades as a function of shift magnitude. This gap is particularly concerning in safety-critical applications such as autonomous driving, medical imaging, and financial risk assessment, where a model that silently degrades under distribution shift can cause significant harm. The inability to quantify how wrong a model's predictions might be in an unseen environment fundamentally limits its trustworthy deployment.

artificial intelligence, bayesian inference, machine learning, (17 more...)

2605.21783

Genre: Research Report (0.40)

Industry:

Information Technology (0.54)
Banking & Finance (0.54)
Transportation > Ground > Road (0.34)
Automobiles & Trucks (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)

arXiv.org Machine LearningMay-21-2026

Tippett-minimum Fusion of Representation-space Diffusion Models for Multi-Encoder Out-of-Distribution Detection

Bhuyan, Neelkamal

We address out-of-distribution (OOD) detection across the full spectrum of distribution shifts -- global domain changes, semantic divergence, texture differences, and covariate corruptions -- through a multi-encoder fusion of per-encoder representation-space diffusion models (RDMs). We statistically identify each encoder's sensitivity to specific shift types from ID data alone and introduce EncMin2L -- an encoder-agnostic two-level $\min(\cdot)$-gate that combines and calibrates per-encoder diffusion-based likelihood detectors without OOD labels, outperforming monolithic multi-encoder baselines at $2.3\times$ lower parameter cost. Two ID-data diagnostics: $η^2$ (class-conditional F-test) and $Δμ$ (log-likelihood shift under synthetic corruptions) -- quantify encoder specialization, while a Tippett minimum $p$-value combination aggregates per-encoder scores into a single, calibration-stable OOD signal. EncMin2L achieves $\geq 0.94$ AUROC across all four shift types simultaneously, outperforming the state-of-the-art representation-space diffusion OOD detectors across overlapping benchmarks.

artificial intelligence, cifar-100, machine learning, (17 more...)

2605.20502

Country: North America (0.28)

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Machine LearningMay-20-2026

FLUXtrapolation: A benchmark on extrapolating ecosystem fluxes

Fries, Anya, Nelson, Jacob A, Jung, Martin, Reichstein, Markus, Peters, Jonas

We introduce FLUXtrapolation, a benchmark for extrapolating ecosystem fluxes under progressively harder distribution shifts. Ecosystem fluxes are central to understanding the carbon, water, and energy cycles, yet they can only be measured directly at sparsely located measurement towers. Producing global flux estimates therefore requires training models on observed sites using globally available covariates and predicting in unobserved regions, that is, upscaling. Flux upscaling is a challenging domain generalization problem that is affected by a shift in covariate distribution across climates, ecosystem types, and environmental conditions, as well as by conditional shift: important drivers remain unobserved at global scale. We provide a quantitative analysis of both these shifts in $P_X$ and $P_{Y\mid X}$. FLUXtrapolation is designed based on domain expertise on flux upscaling: it defines temporal, spatial, and temperature-based extrapolation scenarios and evaluates performance across held-out domains, temporal aggregations, and tail errors. In a pilot study, we find that baselines perform similarly under median hourly RMSE, but separate under the proposed tail-focused and multi-scale evaluation. FLUXtrapolation therefore poses a realistic and thus relevant challenge for machine learning methods under distribution shift; at the same time, progress on this benchmark would directly support the scientific goal of improving flux upscaling.

artificial intelligence, fluxnet product, machine learning, (14 more...)

2605.19812

Country:

North America > United States (1.00)
Europe (0.93)

Genre: Research Report (0.81)

Industry: Energy (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Yamamoto, Kakei, Wainwright, Martin J.

TILT: Target-induced loss tilting under covariate shift

arXiv.org Machine LearningMay-15-2026

We introduce and analyze Target-Induced Loss Tilting (TILT) for unsupervised domain adaptation under covariate shift. It is based on a novel objective function that decomposes the source predictor as $f+b$, fits $f+b$ on labeled source data while simultaneously penalizing the auxiliary component $b$ on unlabeled target inputs. The resulting fit $f$ is deployed as the final target predictor. At the population level, we show that this target-side penalty implicitly induces relative importance weighting at the population level, but in terms of an estimand $b^*_f$ that is self-localized to the current error, and remains uniformly bounded for any source-target pair (even those with disjoint supports). We prove a general finite-sample oracle inequality on the excess risk, and use it to give an end-to-end guarantee for training with sparse ReLU networks. Experiments on controlled regression problems and shifted CIFAR-100 distillation show that TILT improves target-domain performance over source-only training, exact importance weighting, and relative density-ratio baselines, with a stable dependence on the regularization parameter.

artificial intelligence, data mining, machine learning, (19 more...)

2605.1428

Country: North America > United States (0.46)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Data Science > Data Mining (0.67)

Neural Information Processing SystemsApr-30-2026, 19:47:50 GMT

0084ae4bc24c0795d1e6a4f58444d39b-Paper.pdf

artificial intelligence, estimator, machine learning, (17 more...)

Country: North America > United States > New York (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > Strength High (0.93)
Research Report > New Finding (0.93)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.93)

Neural Information Processing SystemsApr-30-2026, 04:22:13 GMT

Towards a Unified Analysis of Kernel-based Methods Under Covariate Shift

Covariate shift occurs prevalently in practice, where the input distributions of the source and target data are substantially different. Despite its practical importance in various learning problems, most of the existing methods only focus on some specific learning tasks and are not well validated theoretically and numerically. To tackle this problem, we propose a unified analysis of general nonparametric methods in a reproducing kernel Hilbert space (RKHS) under covariate shift. Our theoretical results are established for a general loss belonging to a rich loss function family, which includes many commonly used methods as special cases, such as mean regression, quantile regression, likelihood-based classification, and margin-based classification. Two types of covariate shift problems are the focus of this paper and the sharp convergence rates are established for a general loss function to provide a unified theoretical analysis, which concurs with the optimal results in literature where the squared loss is used. Extensive numerical studies on synthetic and real examples confirm our theoretical findings and further illustrate the effectiveness of our proposed method.

artificial intelligence, covariate shift, machine learning, (14 more...)

Country: Asia > China (0.14)

Industry: Education (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Neural Information Processing SystemsApr-30-2026, 02:28:12 GMT

e3fea99df80195b316cefa7aa6099cd5-Paper-Conference.pdf

artificial intelligence, machine learning, stability, (18 more...)

Country: North America > United States (0.46)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area (0.68)

Technology:

Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.47)

Neural Information Processing SystemsApr-25-2026, 20:49:32 GMT

Reducing the Covariate Shift by Mirror Samples in Cross Domain Alignment

Eliminating the covariate shift cross domains is one of the common methods to deal with the issue of domain shift in visual unsupervised domain adaptation. However, current alignment methods, especially the prototype based or sample-level based methods neglect the structural properties of the underlying distribution and even break the condition of covariate shift. To relieve the limitations and conflicts, we introduce a novel concept named (virtual) mirror, which represents the equivalent sample in another domain. The equivalent sample pairs, named mirror pairs reflect the natural correspondence of the empirical distributions. Then a mirror loss, which aligns the mirror pairs cross domains, is constructed to enhance the alignment of the domains. The proposed method does not distort the internal structure of the underlying distribution. We also provide theoretical proof that the mirror samples and mirror loss have better asymptotic properties in reducing the domain shift. By applying the virtual mirror and mirror loss to the generic unsupervised domain adaptation model, we achieved consistently superior performance on several mainstream benchmarks.

artificial intelligence, domain adaptation, machine learning, (13 more...)

Genre: Research Report (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Neural Information Processing SystemsApr-25-2026, 00:49:11 GMT

Discovering and Overcoming Limitations of Noise-engineered Data-free Knowledge Distillation

Distillation in neural networks using only the samples randomly drawn from a Gaussian distribution is possibly the most straightforward solution one can think of for the complex problem of knowledge transfer from one network (teacher) to the other (student). If successfully done, it can eliminate the requirement of teacher's training data for knowledge distillation and avoid often arising privacy concerns in sensitive applications such as healthcare. There have been some recent attempts at Gaussian noise-based data-free knowledge distillation, however, none of them offer a consistent or reliable solution. We identify the shift in the distribution of hidden layer activation as the key limiting factor, which occurs when Gaussian noise is fed to the teacher network instead of the accustomed training data. We propose a simple solution to mitigate this shift and show that for vision tasks, such as classification, it is possible to achieve a performance close to the teacher by just using the samples randomly drawn from a Gaussian distribution.

artificial intelligence, gaussian noise, machine learning, (14 more...)

Industry:

Education (0.94)
Information Technology > Security & Privacy (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)